#model merging · 11/09/2025
mmBERT Unveiled: 3T Tokens, 1,833 Languages, and a 2–4× Speed Boost for Multilingual Encoding
mmBERT is a multilingual encoder pretrained on 3 trillion tokens across 1,833 languages. It runs 2–4× faster than previous multilingual encoders and supports 8k-token contexts. Its training recipe combines annealed language learning, an inverse masking schedule, and model merging to improve performance on both high-resource and low-resource benchmarks.
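To make the scheduling ideas concrete, here is a minimal Python sketch of how annealed language learning and an inverse masking schedule could interact across training phases. The phase names, temperature values, mask ratios, and token counts below are illustrative assumptions, not figures from the mmBERT release: lowering the sampling temperature flattens the language distribution (upweighting low-resource languages later in training) while the mask ratio is simultaneously decreased.

```python
import numpy as np

def temperature_sample_weights(token_counts, tau):
    """Temperature-based language sampling: p_i proportional to c_i ** tau.
    Lower tau flattens the distribution, shifting weight toward
    low-resource languages."""
    counts = np.asarray(token_counts, dtype=np.float64)
    probs = counts ** tau
    return probs / probs.sum()

# Hypothetical three-phase schedule in the spirit of the described recipe:
# the sampling temperature anneals down while the mask ratio also drops
# ("inverse masking" relative to a fixed-ratio MLM setup).
phases = [
    {"name": "pretrain",  "tau": 0.7, "mask_ratio": 0.30},
    {"name": "mid-train", "tau": 0.5, "mask_ratio": 0.15},
    {"name": "decay",     "tau": 0.3, "mask_ratio": 0.05},
]

# Toy corpus sizes (tokens) for a high-, mid-, and low-resource language.
token_counts = {"en": 1_000_000_000, "sw": 50_000_000, "fo": 1_000_000}

for phase in phases:
    weights = temperature_sample_weights(list(token_counts.values()), phase["tau"])
    sampled = dict(zip(token_counts, weights.round(3)))
    print(f"{phase['name']:>9}: sample={sampled}  mask_ratio={phase['mask_ratio']}")
```

Running this shows the low-resource languages' sampling probabilities rising as the temperature anneals, which is the intuition behind pairing annealed language learning with a shrinking mask ratio late in training.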